Dev by zTgx · Pull Request #76 · vectorlessflow/vectorless

zTgx · 2026-04-17T12:46:40Z

Summary

Changes

Checklist

Code compiles (cargo build)
Tests pass (cargo test --lib --all-features)
No new clippy warnings (cargo clippy --all-features)
Public APIs have documentation comments
Python bindings updated (if Rust API changed)

Notes

- Change crate description to "A reasoning-native document engine for AI"
- Add support for structured documents including PDFs, Markdown, reports, and contracts
- Rename client variable to engine in example code
- Add endpoint configuration option to EngineBuilder
- Reorganize module declarations with clear section headers
- Move metrics module declaration to appropriate location
- Restructure public API exports with categorized sections
- Update example code formatting for better readability

- Change public modules to private in rust/src/lib.rs
- Update import statements across codebase to use direct module paths instead of nested client module paths
- Modify examples and documentation to reflect new import structure
- Update README.md and all MDX documentation files with correct imports

…kpoint path logic

- Remove unused workspace_dir field from Engine struct
- Eliminate local workspace_dir variable in constructor
- Directly construct checkpoint path from config instead of storing intermediate field
- Remove workspace_dir from Clone implementation since field is removed

This change simplifies the Engine struct by removing redundant state storage and directly accessing the workspace directory from config when needed.

- Consolidate imports by removing redundant module paths like `client::` and `graph::` from use statements across python binding files
- Extract parallel source processing logic into separate `process_sources` method for better code organization
- Simplify index method by removing unnecessary single-source conditional branch
- Update documentation to reflect that multiple sources are indexed in parallel
- Replace error string formatting with `.into()` for better performance
- Modernize tracing warning syntax to use `{e}` instead of positional formatting
- Expand public exports in rust lib.rs to include all graph types

The changes improve code maintainability and consistency while maintaining the same functionality.

Removed the extensive example code block from the query method documentation to simplify the API reference and reduce documentation size.

- Remove unused graph module exports from document/mod.rs including DocumentGraph, DocumentGraphConfig, DocumentGraphNode, EdgeEvidence, GraphEdge, GraphMetadata, KeywordDocEntry, SharedKeyword, and WeightedKeyword
- Remove unused Event export from events/mod.rs while keeping IndexEvent, QueryEvent, and WorkspaceEvent

Remove the graph module and its re-exports from document module as they are no longer needed for backward compatibility.

BREAKING CHANGE: The graph type re-exports from crate::document are removed. Use direct imports from crate::graph instead.

…ne options

- Extract common indexing and persistence logic into index_and_persist method
- Add build_index_item helper method to construct IndexItem from IndexedDocument
- Modify build_pipeline_options to accept IndexSource instead of format
- Add build_pipeline_options_from_doc for persistence-specific configuration
- Move rebuild_graph and extract_keywords_from_doc methods back to Engine
- Update indexer methods to accept pre-built pipeline options instead of IndexOptions
- Make detect_format_from_path public for internal use
- Rename to_persisted_with_options to to_persisted as associated function

…ttern

- Replace separate SummaryConfig and ConcurrencyConfig with unified LlmConfig that includes throttle settings
- Add with_config method to EngineBuilder for advanced configuration
- Simplify builder override logic to write to single config location instead of multiple places
- Update documentation with both simple and advanced usage examples
- Rename concurrency settings to throttle for clarity
- Consolidate configuration merge logic into fewer types

BREAKING CHANGE: SummaryConfig and ConcurrencyConfig have been removed and replaced with LlmConfig and ThrottleConfig
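
The unified configuration described above could look roughly like the sketch below. Only the names `LlmConfig`, `ThrottleConfig`, `EngineBuilder`, and `with_config` come from the commit message; the `Config` wrapper, every field name, the default values, and the `with_max_concurrent` and `build_config` helpers are assumptions made for illustration and may not match the crate.

```rust
// Sketch only: field names and defaults are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct ThrottleConfig {
    pub max_concurrent: usize,
    pub requests_per_minute: u32,
}

impl Default for ThrottleConfig {
    fn default() -> Self {
        Self { max_concurrent: 4, requests_per_minute: 60 }
    }
}

/// Throttle settings live inside the LLM config instead of a separate type.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct LlmConfig {
    pub model: String,
    pub throttle: ThrottleConfig,
}

/// Hypothetical top-level config wrapper.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Config {
    pub llm: LlmConfig,
}

#[derive(Default)]
pub struct EngineBuilder {
    config: Config,
}

impl EngineBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Advanced path: replace the whole configuration at once.
    pub fn with_config(mut self, config: Config) -> Self {
        self.config = config;
        self
    }

    /// Simple path (hypothetical): an override writes to the single config location.
    pub fn with_max_concurrent(mut self, n: usize) -> Self {
        self.config.llm.throttle.max_concurrent = n;
        self
    }

    /// Hypothetical accessor so the merged result can be inspected.
    pub fn build_config(self) -> Config {
        self.config
    }
}
```

Because every override writes into the same `Config` value, an override applied after `with_config` replaces only the field it targets.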

…dependencies

- Remove unused `toml` dependency from both python and rust Cargo.toml files
- Move throttle configuration from concurrency to llm.throttle namespace
- Remove temperature setting method that was misplaced in retrieval config
- Delete deprecated config loader and merge modules
- Consolidate configuration types and update import paths

Remove unused re-exports of GraphMetadata, KeywordDocEntry, and SharedKeyword from the types module to clean up the public API.

The workspace field in Engine struct was changed from Option<WorkspaceClient> to WorkspaceClient directly, removing unnecessary optionality checks.

BREAKING CHANGE: Engine now requires a workspace client to be provided during initialization instead of allowing it to be optional.

- Move IndexedDocument struct and implementation from types.rs to new indexed_document.rs module
- Update imports in engine.rs and indexer.rs to use the new module
- Add proper documentation for the internal IndexedDocument type
- Keep IndexedDocument as internal-only type (not part of public API)

The change separates concerns by moving the internal intermediate type to its own module, improving code organization.

feat(storage): re-export PageContent from persistence module

- Add PageContent to the re-exports in storage/mod.rs to make it available at the storage module level

…context

BREAKING CHANGE: Removed include_reasoning and depth_limit options from QueryContext as they were no longer used in the retrieval logic.

- Remove include_reasoning field and with_include_reasoning method
- Remove depth_limit field and with_depth_limit method
- Update tests to reflect removed functionality
- Use DocumentFormat::SUPPORTED_EXTENSIONS as single source of truth for supported file extensions instead of hardcoded array
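
To illustrate the "single source of truth" point for file extensions, here is a minimal sketch. `DocumentFormat::SUPPORTED_EXTENSIONS` is named in the commit message, but its type, its contents, the enum variants, and the `is_supported_path` helper are assumptions; the real definitions may differ.

```rust
use std::path::Path;

/// Variants are hypothetical; the PR only mentions PDF and Markdown support.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DocumentFormat {
    Markdown,
    Pdf,
}

impl DocumentFormat {
    /// Assumed shape: a flat list of lowercase extensions without the dot.
    pub const SUPPORTED_EXTENSIONS: &'static [&'static str] = &["md", "markdown", "pdf"];
}

/// Hypothetical caller that consults the shared constant instead of
/// keeping its own hardcoded array.
pub fn is_supported_path(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| {
            DocumentFormat::SUPPORTED_EXTENSIONS
                .iter()
                .any(|known| known.eq_ignore_ascii_case(ext))
        })
}
```

Any code that needs to filter files can call a helper like this, so adding a format means touching one constant rather than several arrays.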

- Introduce AtomicBool flag to track when document graph needs rebuilding
- Replace immediate graph rebuild in index() with lazy marking
- Add lazy rebuild trigger in query() when graph_dirty flag is set
- Change file operations from std::fs to tokio::fs for async I/O
- Update IndexerClient::to_persisted to async function
- Add clone implementation for graph_dirty field

refactor(query): add pilot reasoning and depth limit options

- Add include_reasoning field to control reasoning chain output
- Add depth_limit field for maximum tree traversal depth
- Include builder methods with_include_reasoning() and with_depth_limit()
- Set default include_reasoning to true for backward compatibility
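
The lazy graph rebuild described in the first commit above (mark dirty on index, rebuild on the next query) can be sketched with a plain `AtomicBool`. The `LazyGraph` type and its rebuild counter are stand-ins invented for this example; only the `graph_dirty` flag name and the index/query split come from the commit message.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

/// Hypothetical stand-in for the engine; only the dirty-flag logic is shown.
pub struct LazyGraph {
    graph_dirty: AtomicBool,
    rebuilds: AtomicU32, // counts rebuilds so the behavior is observable
}

impl LazyGraph {
    pub fn new() -> Self {
        Self {
            graph_dirty: AtomicBool::new(false),
            rebuilds: AtomicU32::new(0),
        }
    }

    /// Indexing no longer rebuilds the graph; it only marks it stale.
    pub fn index(&self) {
        self.graph_dirty.store(true, Ordering::Release);
    }

    /// Querying rebuilds the graph only if something was indexed since the
    /// last rebuild. Returns the total number of rebuilds performed so far.
    pub fn query(&self) -> u32 {
        // `swap` clears the flag and reports its previous value in one atomic
        // step, so two concurrent queries cannot both claim the rebuild.
        if self.graph_dirty.swap(false, Ordering::AcqRel) {
            self.rebuilds.fetch_add(1, Ordering::Relaxed);
        }
        self.rebuilds.load(Ordering::Relaxed)
    }
}
```

With this shape, several `index()` calls in a row cost one rebuild at the next `query()` instead of one rebuild each.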

… indexer modules

- Format multi-line use statements with proper indentation and trailing commas
- Remove unnecessary line breaks in variable assignments for better conciseness
- Reformat long method chains and function calls to span multiple lines for improved readability
- Apply consistent formatting to match expressions and await calls across the codebase

- Remove AsyncEventHandler trait and related async handling logic
- Delete async_handlers field from EventEmitterInner struct
- Remove with_async_handler method and associated async processing
- Clean up Event enum as it's no longer needed with the removed async handling
- Update emit methods to only process synchronous handlers

…ngine

- Add shared cancel flag and active operation tracking to Engine struct
- Implement cancel() and reset_cancel() methods for operation control
- Add timeout support for index() and query() operations via with_timeout()
- Introduce ActiveGuard RAII pattern to manage active operation count
- Add check_cancel() calls to index() and query() methods to respect cancellation
- Remove unused MetricsHub from Engine and related modules

refactor(client): update IndexOptions and QueryContext with timeout support

- Add timeout_secs field to IndexOptions and QueryContext structs
- Implement with_timeout_secs() builder methods for both contexts
- Remove unused include_text field from IndexOptions
- Remove deprecated ClientError enum

refactor(workspace): improve document loading and workspace clearing

- Simplify load() method by removing redundant workspace.contains() check
- Enhance clear() method with proper error handling and accurate removal counting
- Update logging and event emission for workspace operations

refactor(client): remove unused EventEmitter and ClientError exports

- Remove unused events field and EventEmitter dependency from Engine
- Remove ClientError export from client module and root library
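
The cancel flag and `ActiveGuard` RAII pattern from the first commit above can be sketched as follows. The method names `cancel()`, `reset_cancel()`, and `check_cancel()` and the `ActiveGuard` name come from the commit message; the `OpControl` wrapper, the `enter`, `active_count`, and `run_step` helpers, and the error type are assumptions for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical holder for the shared state; in the PR this lives on Engine.
#[derive(Clone, Default)]
pub struct OpControl {
    cancelled: Arc<AtomicBool>,
    active: Arc<AtomicUsize>,
}

/// Decrements the active-operation count when dropped, even on early return.
pub struct ActiveGuard {
    active: Arc<AtomicUsize>,
}

impl Drop for ActiveGuard {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}

impl OpControl {
    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::Release);
    }

    pub fn reset_cancel(&self) {
        self.cancelled.store(false, Ordering::Release);
    }

    pub fn check_cancel(&self) -> Result<(), &'static str> {
        if self.cancelled.load(Ordering::Acquire) {
            Err("operation cancelled")
        } else {
            Ok(())
        }
    }

    /// Registers one active operation; the count drops when the guard does.
    pub fn enter(&self) -> ActiveGuard {
        self.active.fetch_add(1, Ordering::AcqRel);
        ActiveGuard { active: Arc::clone(&self.active) }
    }

    pub fn active_count(&self) -> usize {
        self.active.load(Ordering::Acquire)
    }

    /// Stand-in for index()/query(): track the operation, then honor cancellation.
    pub fn run_step(&self) -> Result<(), &'static str> {
        let _guard = self.enter();
        self.check_cancel()?; // early return still releases the guard
        Ok(())
    }
}
```

The guard keeps the active count accurate on every exit path, which a manual decrement at the end of the function would not do when `?` returns early.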

…tions

BREAKING CHANGE: The include_text parameter has been removed from PyIndexOptions as it was no longer used in the indexing logic.

feat(engine): add metrics hub and improve query performance

- Add central MetricsHub for unified metric collection
- Implement parallel querying with concurrency limits using buffer_unordered
- Add metrics_report method to retrieve comprehensive metrics including LLM usage, pilot decisions, and retrieval operations

refactor(storage): update checkpoint directory configuration

- Add checkpoint_dir field to StorageConfig that defaults to workspace_dir/join("checkpoints")
- Use the new checkpoint_dir configuration consistently
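
The checkpoint directory default from the last commit above can be sketched like this. `StorageConfig`, `workspace_dir`, and `checkpoint_dir` are named in the commit message; modeling the field as an `Option<PathBuf>` resolved through a `resolved_checkpoint_dir` method is an assumption, and the crate may compute the default differently (for example at deserialization time).

```rust
use std::path::PathBuf;

/// Sketch only: the real StorageConfig likely has more fields.
#[derive(Debug, Clone)]
pub struct StorageConfig {
    pub workspace_dir: PathBuf,
    /// `None` means "use the default under the workspace directory".
    pub checkpoint_dir: Option<PathBuf>,
}

impl StorageConfig {
    /// Hypothetical resolver: explicit setting wins, otherwise
    /// fall back to `<workspace_dir>/checkpoints`.
    pub fn resolved_checkpoint_dir(&self) -> PathBuf {
        self.checkpoint_dir
            .clone()
            .unwrap_or_else(|| self.workspace_dir.join("checkpoints"))
    }
}
```

Resolving the path in one place means every caller sees the same directory, whether or not the user set it explicitly.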

- Add tracing attributes feature to enable instrument macro usage
- Add #[tracing::instrument] macros to key methods (index, process_source, query, run_pipeline, retriever::query) with appropriate field logging
- Create test_support module with build_test_engine helper for no-LLM integration testing
- Add IndexerClient::with_factory method for custom executor injection during testing
- Implement comprehensive integration test suite covering full index/persist/query lifecycle with various scenarios including single/multiple sources, force/default modes, cancellation handling, and edge cases
- Expose test utilities via vectorless::__test_support module for integration tests only

…logic

- Add AtomicU32 import for tracking consecutive graph rebuild failures
- Introduce GRAPH_REBUILD_MAX_FAILURES constant (3) to limit rebuild attempts
- Add graph_fail_count field to Engine struct to track consecutive failures
- Reset failure count when new items are indexed to allow fresh rebuild attempts
- Implement index_with_retry method with configurable LLM retry parameters
- Replace direct indexer calls with retry-wrapped versions in index operations
- Add logic to skip graph rebuilds after reaching failure threshold
- Reset failure count on successful graph rebuilds
- Update Clone implementation to include graph_fail_count field

feat(workspace): add warning for document overwrites during concurrent indexing

- Log warning when saving documents with existing IDs to detect concurrent indexing
- Include document ID and name in warning message for debugging

feat(storage): add schema versioning for persisted documents

- Introduce SCHEMA_VERSION constant (1) for backward compatibility tracking
- Add optional schema_version field to PersistedDocument with default value
- Set current schema version when creating new persisted documents
- Validate schema version on document loading (warn for old, error for future)
- Add schema version checks in both load_document_with_options and load_document_from_bytes_with_options
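
The schema version validation rule in the last commit above ("warn for old, error for future") can be sketched as a small pure function. `SCHEMA_VERSION` and its value of 1 come from the commit message; the `SchemaCheck` enum, the `check_schema_version` function, and the error message are hypothetical, and the sketch deliberately leaves out how a missing `schema_version` field is defaulted.

```rust
use std::cmp::Ordering;

pub const SCHEMA_VERSION: u32 = 1;

/// Hypothetical outcome type for a successful check.
#[derive(Debug, PartialEq)]
pub enum SchemaCheck {
    /// Document was written by the current schema.
    Current,
    /// Document is older; still loadable, but the caller should log a warning.
    Older(u32),
}

/// Accepts current and older versions, rejects versions from the future.
pub fn check_schema_version(found: u32) -> Result<SchemaCheck, String> {
    match found.cmp(&SCHEMA_VERSION) {
        Ordering::Equal => Ok(SchemaCheck::Current),
        Ordering::Less => Ok(SchemaCheck::Older(found)),
        Ordering::Greater => Err(format!(
            "document schema version {found} is newer than supported version {SCHEMA_VERSION}"
        )),
    }
}
```

Rejecting only newer versions keeps old workspaces readable after an upgrade while preventing an older binary from misreading data it does not understand.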

- Move atomic imports to maintain consistent ordering in engine.rs
- Split long import statements across multiple lines for better readability
- Break down long function calls and match expressions into multiple lines
- Remove unnecessary line breaks in function definitions
- Reformat complex nested expressions for improved clarity

- Add concurrency configuration to ReasoningIndexConfig in engine
- Implement shared ConcurrencyController in LlmPool to manage concurrent requests across index, retrieval, and pilot clients
- Initialize LLM clients with shared throttle controller from config
- Set up proper concurrency limits using runtime configuration

- Introduce shared async-openai client across LLM executors to reuse connection pools and reduce resource overhead
- Add RetryConfig::to_runtime_config method for proper configuration conversion between config and runtime types
- Implement LlmClient::with_shared_openai_client method to inject shared client instances
- Remove redundant prompt truncation logic and inline request building
- Move retry logic into LlmError::is_retryable for unified handling
- Consolidate retry configuration mapping in LLM pool initialization
- Remove standalone retry module as functionality is now integrated into executor and error handling
- Add fallback chain sharing across pool clients for consistent error recovery behavior
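
Moving the retry decision onto the error type, as the `LlmError::is_retryable` bullet above describes, can be sketched as follows. Only `LlmError` and `is_retryable` are named in the commit message; the variants, the synchronous `call_with_retry` helper, and the absence of backoff are simplifying assumptions (the real executor is async and presumably waits between attempts).

```rust
/// Variants are hypothetical examples of transient and permanent failures.
#[derive(Debug, PartialEq)]
pub enum LlmError {
    RateLimited,
    Timeout,
    InvalidRequest(String),
}

impl LlmError {
    /// The error itself decides whether another attempt could succeed.
    pub fn is_retryable(&self) -> bool {
        matches!(self, LlmError::RateLimited | LlmError::Timeout)
    }
}

/// Hypothetical synchronous retry loop. `call` receives the 1-based attempt
/// number and is always invoked at least once.
pub fn call_with_retry<T>(
    max_attempts: u32,
    mut call: impl FnMut(u32) -> Result<T, LlmError>,
) -> Result<T, LlmError> {
    let mut attempt = 1;
    loop {
        match call(attempt) {
            Ok(value) => return Ok(value),
            // Retry only transient errors, and only while attempts remain.
            Err(e) if e.is_retryable() && attempt < max_attempts => attempt += 1,
            Err(e) => return Err(e),
        }
    }
}
```

Keeping the classification on the error type lets the executor and any fallback logic share one definition of "transient" instead of duplicating the match in a separate retry module.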

…oller

- Remove unused current_endpoint variable from LlmExecutor::execute method
- Update do_request method signature to remove endpoint parameter
- Use self.config.endpoint directly instead of passing as parameter
- Remove concurrency controller from LlmPool as it's no longer needed
- Clean up related methods and tests that depended on concurrency control

- Introduce MetricsHub for centralized LLM call statistics collection
- Add metrics recording for successful/failed calls, token usage, and timing
- Track specific error types including rate limits, timeouts, and fallbacks
- Remove deprecated overall_success_rate method from metrics report
- Update LLM executor to record metrics during API calls and error handling
- Modify LLM pool to support shared metrics hub across all client instances
- Enhance EngineBuilder to integrate metrics hub into component initialization
- Add comprehensive test coverage for metrics functionality

- Move ConcurrencyConfig and ConcurrencyController from root throttle module to llm::throttle submodule
- Remove the standalone throttle module and integrate its functionality directly into the llm module
- Update imports across related files to use the new path crate::llm::throttle instead of crate::throttle
- Add comprehensive documentation and tests for the throttle functionality within the llm module

- Remove pub(crate) visibility from throttle exports in llm module
- Update import paths in test modules to use absolute paths instead of relative imports
- Change `use super::throttle::ConcurrencyConfig` to `use crate::llm::throttle::ConcurrencyConfig` in client.rs and executor.rs test modules

…ad safety

- Move memo module from root to llm/memo subdirectory
- Update import paths in enhance.rs, strategy.rs, llm_pilot.rs, pipeline_retriever.rs, and toc_navigator.rs
- Replace AsyncRwLock with atomic statistics for better performance
- Remove unnecessary locking in cache operations
- Consolidate stats methods into synchronous lock-free implementation
- Add load_from method to AtomicStats for restoring persisted data
- Update MemoStore constructor to remove capacity parameter
- Remove redundant comments and streamline code structure

BREAKING CHANGE: Memo module is now located under llm::memo namespace
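
The lock-free statistics mentioned above can be sketched with atomic counters. `AtomicStats` and `load_from` are named in the commit message; the `hits` and `misses` fields, the `StatsSnapshot` type, and the other method names are assumptions chosen to make the idea concrete.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Counters that can be updated through a shared reference without a lock.
#[derive(Default)]
pub struct AtomicStats {
    hits: AtomicU64,
    misses: AtomicU64,
}

/// Hypothetical plain-data view used for reporting and persistence.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StatsSnapshot {
    pub hits: u64,
    pub misses: u64,
}

impl AtomicStats {
    pub fn record_hit(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }

    pub fn record_miss(&self) {
        self.misses.fetch_add(1, Ordering::Relaxed);
    }

    /// Synchronous read; no `.await` and no lock needed.
    pub fn snapshot(&self) -> StatsSnapshot {
        StatsSnapshot {
            hits: self.hits.load(Ordering::Relaxed),
            misses: self.misses.load(Ordering::Relaxed),
        }
    }

    /// Restores counters from previously persisted data.
    pub fn load_from(&self, saved: StatsSnapshot) {
        self.hits.store(saved.hits, Ordering::Relaxed);
        self.misses.store(saved.misses, Ordering::Relaxed);
    }
}
```

Each counter is individually consistent, but a snapshot taken during concurrent updates may pair a newer `hits` value with an older `misses` value; for monitoring counters that trade-off is usually acceptable.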

- Add MemoOpType variants for NodeEvaluation, SufficiencyCheck, ComplexityDetection, and QueryDecomposition
- Implement caching for complexity detection with new ComplexityDetector memo store integration
- Add query decomposition caching with serialization/deserialization logic for DecompositionResult
- Integrate memo stores across retrieval stages (analyze, evaluate, search) to cache LLM evaluations
- Add NodeEvaluation caching for both single and batch LLM evaluations
- Implement sufficiency check caching in LlmJudge component
- Update module exports to include MemoOpType in public API

vercel · 2026-04-17T12:46:46Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
vectorless	Ready	Preview, Comment	Apr 17, 2026 0:46am

zTgx added 30 commits April 17, 2026 09:21

refactor(engine): remove example from query method documentation

e4a6e5a

Removed the extensive example code block from the query method documentation to simplify the API reference and reduce documentation size.

refactor(graph): remove unused re-exports from types module

073e033

Remove unused re-exports of GraphMetadata, KeywordDocEntry, and SharedKeyword from the types module to clean up the public API.

zTgx merged commit 4818c92 into main Apr 17, 2026
6 of 7 checks passed

zTgx merged 30 commits into main from dev
